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Abstract 


The  techniques  peculiar  to  dynamic  programming  have  found  a  variety 
of  successful  applications  In  the  theory  and  practice  of  modern  control. 
Successes  In  the  theory  and  practice  of  signal  and  Image  processing  are 
less  numerous  and  prominent,  but  they  do  exist.  In  this  paper  we  sound 
a  call  for  renewed  attention  to  the  potential  of  dynamic  programming 
for  solving  knotty  nonlinear  filtering  problems  In  signal  and  image 
processing,  and  outline  successes  we  have  recently  enjoyed  in  nonlinear 
frequency  tracking  and  random  boundary  estimation  in  noisy  black  and 
white  images.  Two  classical  results,  the  fast  Fourier  transform  (FFT) 
and  Levinson's  recursion  for  determining  autoregressive  parameters,  are 
treated  in  the  context  of  dynamic  programming  simply  to  reinforce  our 
view  that  many  of  the  algorithms  we  take  for  granted,  and  which  were 
derived  without  recourse  to  dynamic  programming,  can  be  nicely  Interpreted 
as  dynamic  programming  algorithms. 
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I .  Introduction 


In  a  recent  paper.  Will sky  [1}  staked  out  a owe  of  the  coamon  ground 
upon  which  specialists  in  control  and  signal  processing  stand.  A  1974 
paper  by  Kailath  [2]  achieved  a  similar  rapprochement  between  the  classi¬ 
cal  frequency  domain  techniques  associated  with  Wiener  filtering  and 
signal  processing,  and  the  time  domain  techniques  associated 
with  Kalman  filtering  and  modern  control.  Such  efforts  have  kindled 
interest  in  further  exploring  the  territory  consoon  to  the  respective 
communities  for  the  adventure  and  profit  that  is  in  it. 

In  this  paper  it  is  our  aim  to  further  stimulate  this  Interest  by 
showing  that  dynamic  programming,  a  fundamental  technique  in  control 
theory  since  Bellman's  introduction  and  advocacy  of  it  in  the  mid-1950's, 
can  be  of  considerably  more  value  in  signal  and  image  processing  than 
has  generally  been  recognized.  This  is  not  to  say  others  have  not 
recognized  the  potential  of  dynamic  programming  and  exploited  its  tech¬ 
niques  to  solve  interesting  signal  processing  problems.  We  mention  in 
particular  Viterbi's  dynamic  programming  algorithm  for  decoding  convolu¬ 
tional  code  sequences  [3],  Cahn's  dynamic  programming  algorithm  for  FM 
demodulation  [4],  and  Forney's  discussion  of  the  Viterbi  algorithm  and 
other  inference  problems  on  finite-state  Markov  sequences  that  can  be 
solved  with  the  techniques  of  dynamic  programming  [5]. 

In  the  sections  to  follow  we  re-derive  classical  algorithms  in 
discrete  Fourier  analysis  and  linear  prediction  using  the  principle  of 
dynamic  progr arming.  We  than  present  two  new  dynamic  programming 
algorithms.  One  is  for  nonlinear  frequency  tracking  and  the  other  Is 
for  edge  detection  in  noisy  black  and  white  Images. 

The  organization  is  as  follows.  In  Section  II  we  use  dynamic 
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programming  arguments  to  re -derive  the  Goertzel  and  deciaat ion- in-frequency 

FFT  algorithms  for  efficiently  computing  the  DFT.  In  Section  IV  we  discuss 
the  connections  between  control,  detection,  estimation,  and  prediction  of 
autoregressive  sequences  observed  in  additive  noise.  We  highlight  the 
central  role  played  by  the  so-called  normal  equations  and  re-derive  the 
Levinson  algorithm  for  recursively  solving  them  in  0(N  )  operations. 

Sections  V  and  VI  contain  the  new  results.  In  Section  V  we  derive  a 
dynamic  programming  algorithm  for  tracking  a  frequency  sequence  in  additive 
noise.  This  is  a  rather  classical  nonlinear  filtering  problem.  The  results  of 
Section  VI  show  how  dynamic  programming  may  be  used  to  derive  a  new  algo¬ 
rithm  for  estimating  local  segments  of  object  boundaries  in  noisy  black 
and  white  images. 
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II.  A  Dynamic  Programming  Formalin 
Traditionally,  dynamic  programming  has  been  uead  to  find  "optimum" 
solutions  to  multi-stage  decision  problems.  An  "optimum"  solution  has 
generally  been  one  that  maximizes  or  minimizes  a  performance  or  cost 
functional.  When  the  multi-stage  decision  problem  is  cast  in  a  prob¬ 
abilistic  framework  the  cost  functional  Is  typically  a  multivariable 
likelihood  function  or  some  monotone  function  of  it. 


Here  is  a  formalism  that  is  rich  enough  to  embrace  most  of  the 

"signal-in-noise"  problems  encountered  in  signal  and  image  processing. 

00 

Let  {x^}^  denote  a  process  with  state  variable  representation 

*k+l  "  fk(VV 
yk  "  MV 

Here  ffc  and  may  be  random  functions;  the  sequence  {ufc}  is  a  parameter, 
decision,  or  control  sequence  that  may  be  functionally  dependent  on  the 
measurement  sequence  {y^} .  The  range  spaces  for  the  state  x^,  the  param¬ 
eter  u^,  and  the  measurement  yfc  are  respectively  X,  U,  Y.  These  spaces 
may  be  finite,  countable,  or  noncountable .  When  the  spaces  X  and  U  are 
countable  then  their  respective  elements  may  be  placed  in  one-to-one 
correspondence  with  the  integers  and  the  formalism  of  Markov  chain  theory 
may  be  mined.  Even  though  the  states  of  X  may  appear  pretty  uninteresting 
(the  integers  0,1,2,...,),  the  mapping  g^  may  be  chosen  so  that  the  signal 
component  of  g^C-)  generates  characters  c^  that  are  of  great  interest.  The 
idea  is  simply  to  let  a  Markov  chain  control  the  dynamical  state  of  the 
problem  and  reserve  the  role  of  character  generation  for  the  observation 
mechanism  g^C-).  This  point  is  illustrated  in  Figure  1. 

As  an  example  of  character  generation,  consider  the  frequency  tracking 
problem  to  follow  in  Section  V.  The  state  x^  evolves  according  to  the  model 
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XkN  "  X(k-1)H  +  VkM  ’  VkN  e^0»1* • • • tQ"3 ^ 

Xk  “  Xk<modN)  5  xk  e{0,l,...,Q-l] 
where  {v^}  Is  a  sequence  of  lid  random  variables. 

Thus  as  k  runs  from  0  to  K,  x^  rests  in  state  m  for  N  steps,  jumps  to 
state  n  where  it  rests  for  N  steps  and  so  forth.  This  is  illustrated 
in  Figure  2.  The  character  generation  is  defined  by 


8k(xk)  ’  e 

So,  while  the  state  rests  in  state  m  for  N  steps,  the  sequence  of  charac¬ 
ters  is  a  rotating  sequence  of  phasors  as  illustrated  in  Figure  2. 

Now  consider  a  finite  version  of  the  process 

X**  *  (*0,xl . 

-  fn(xn“1,un_1) 

UN  =*  (Up,...,^) 

yn«  (y0,y1,...,yN) 

Typically  one  wants  to  maximize  a  performance  criterion 


ln(xn,un,yn) 

with  respect  to  U1*,  subject  to  constraints  CN(XN,UN)  =  0.  Call 
Lj^(X^,U,Y^)  the  maximum.  When  obeys  a  recursion  of  the  form 

■  hi-i<x”~1'uN"1’y!)  +  vvw  . 

then  dynamic  programming  comes  to  the  fore  and  the  solution  U  may  be 
generated  recursively  as  the  limit  of  the  following  sequence  of  solu¬ 
tions: 


Un  -  S  (Un-1,Yn)  ,  n-l,2,...,N 

n 

Thus  the  central  theme  is  to  imbed  the  solution  to  an  N  stage  problem 
in  a  sequence  of  simpler  K  stage  problems.  When  the  underlying  state 
and  parameter  spaces  are  finite,  the  solution  algorithm  is  finite¬ 
dimensional  and  implementable  on  a  digital  computer.  When  they  are 
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uncountable,  but  the  function  L  is  quadratic,  then  it  Is  still  often 
possible  to  find  a  closed  form  recursive  solution  that  may  be  pro¬ 
grammed  . 

A  very  large  class  of  problems  may  be  formulated  as  above.  Two 
particularly  noteworthy  examples  are  the  linear  discrete-time  quadratic 
regulator  problem  in  deterministic  and  stochastic  control,  and  Markov 
chain  sequence  estimation  in  additive  noise.  On  the  other  hand  there 
are  a  great  number  of  problems  that  admit  dynamic  programming  solutions, 
but  which  are  not  naturally  formulated  in  the  style  above.  The  FFT  and 
Levinson  algorithms  discussed  in  Sections  III  and  IV  are  well-known 
recursive  problem  solutions  that  have  been  derived  without  relying  on  a 
maximization  argument. 

One  of  the  points  we  wish  to  make  is  the  following:  recognizing 
that  a  solution  is  a  limit  of  a  sequence  of  approxlmants  that  may  be 
recursively  computed  is  perhaps  more  fundamental  than  the  search  for  a 
corresponding  optimization  problem.  The  chief  value  of  an  optimization 
formulation  is  that  it  often  simplifies  the  search  for  the  recursive 
solution  algorithm. 


III.  Dynamic  Programming,  the  DFT,  and  the  FFT 
The  DFT  certainly  constitutes  one  of  the  cornerstones  of  modern 
Fourier  analysis.  Its  uses  range  over  the  entire  spectrum  (so  to  speak) 

Ml 

of  signal  processing  applications.  The  DFT  is  a  mapping,  DFT:{x  }"  -*■ 

n  U 

{X  })?  ,  that  takes  the  sequence  {x  }!!  *  into  the  sequence  {X  }!?  * 

m  0  nO  mO 

according  to  the  rule 


x-  ■  "£  *»  C  •  "-0-1 . “-1 


TJ  2  0 

N  6 


k-0 
-j2iv/N 


,-mN 


Noting  that  W„  =  1  Vm,  we  may  write  X  as  follows: 

N  tn 

N-l  . 

X  =  l  x  W-m(N-n) 
m  n=0  nN 

This  calculation  may  be  viewed  as  the  limit  of  the  following  sequence 
of  Imbedded  approximations: 


X(k)  -  ‘'I1  x  W“m(k-n)  ,  k=l,2, . . .  ,N 

m  „  n  N 

n=0 

(k) 

Note  X  obeys  the  following  recursion: 

m 


,(k+l)  _  m  x(k)  +  w-m 

N  ra  N  x 

.  v(D. 


m 

XW)-  X 
m  m 


x  W 
IN 


This  is  the  so-called  Goertzel  algorithm  for  obtaining  the  mth  DFT 

variable,  X  ,  as  the  output  of  a  digital  filter  excited  by  the  sequence 
m 

{xn)Q  *■.  The  output  of  the  filter  is  read  at  time  k«=N.  See  Figure  3. 

Dynamic  Programming  and  the  Decimation-in-frequency  FFT 

The  Goertzel  algorithm  is  a  nice  dynamic  programming-like  solution 

for  the  DFT.  However  it  is  not  efficient.  Let's  see  if  we  can  improve 
(k) 

upon  it.  Consider  X  for  even  frequency  indices  m=2r: 


IV.  Detection,  Estimation,  and  Control  Structures 
in  the  AR(N)  Case:  Kalman  Filters,  Levinson 
Recursions,  and  Dynamic  Programming 

Autoregressive  (AR)  models  for  signals,  states,  and  data  play  a 
starring  role  in  many  areas  of  signal  processing  and  control.  By 
appropriately  selecting  model  parameters  (and  order)  one  can  model 
the  covariance  structure  and  spectral  characteristics  of  more  general 
models.  The  so-called  normal  equations  for  identifying  AR  parameters 
are  elegant  and  easily  solved  with  recursions  of  the  Levinson- type. 

In  this  section  we  tie  up  control,  prediction,  detection,  and 
estimation  in  the  special  case  where  we  are  dealing  with  a  zero-mean, 
wide-sense  stationary,  scalar  autoregressive  time  series.  The  usual 
state-variable  and  matrix  block  diagrams  give  way  to  scalar  variables 
and  digital  filter  blocks  of  moving  average  filters.  The  normal  equa¬ 
tions  are  high-lighted  and  dynamic  programming  is  used  to  derive  the 
famous  Levinson  recursions. 

Models 

Let  {x^}  denote  a  scalar  zero-mean,  wide-sense  stationary  autore¬ 
gressive  sequence  that  obeys  the  recursion 


N 

Z  a  x,  +  w,  V  k 
,  n  Tc-n  k 
n=l 


2 

w  :  sequence  of  i.i.d.  N(0,a  )  r.v.s. 
n  w 

It  is  easy  to  see  that  the  covariance  sequence  (r  rm 

ated  with  the  sequence  {x,}  obeys  the  recursion 


r  * 
m 


N 

Z 

n=l 


a 


n 


r 

m-n 


6 

m 


ra=0 , 1 , . . . , 


r  ,  associ- 
-m 


From  here  one  may  write  out  the  so-called  normal  equations: 
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ro  ri 

rN-l~ 

al 

rl 

rl  ro 

rN-l 

rl  *'■  rN-2 

rl 

ri  ro 

a2 

*N 

m 

r2 

rN 

- 

- 

-  - 

_  _ 

If  the  {a  },  are  known  these  equations  are  used  to  solve  for  the  {r  } 
n  ±  n 

N 

[6].  Conversely,  if  the  {r^Jg  are  known,  these  equations  are  used  to 
N 

solve  for  the  {a  . 

n  1 

In  much  of  what  follows  we  will  assume  the  sequence  {x^}  is  observed 
in  zero-mean  additive  WGN: 

2k  =  *k  +  "k  V  k 

2 

n^  :  sequence  of  i.i.d.  N(0,on>  r.v.s. 

The  companion  form  state  model  for  all  of  this  follows: 

Vl  *  A  \  +  bUk 

Z,  =  C'X, 


V 


Xk-N+1 

o  ! 

1 

1 

0 

Xk-1 

A  = 

o 

M 

. 

,  b=C= 

0 

*k 

*N  *N-1  *  *  *  a0 

1 

’  uk=wk+l 


Noisy  Prediction  and  the  Kalman  Predictor 

The  stationary  Kalman  one-step  predictor  for  the  noisily  observed 
AR(N)  sequence  is 

*k+l  =  A  \  +  K(zk_xk) 

Xk  =  C'*k 

where 

K  -  APC(C'PC  +  o^)-1 
“  » k>2 •  •  •  • * ] 


(1) 
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P  =  (A-K(r)P(A-KC')  +  o2  KK'  +  a2  bb' 

n  w 


The  Kalman  prediction  sequence  {xj^l  for  ^k+l^  can  interPreted 
as  the  output  of  an  ARMA(N,N-1)  filter,  driven  by  the  prediction  error 
sequence  or  an  ARMA(N,N-1)  filter  driven  by  the  observation 

sequence  {z^}.  The  resulting  filter  equations  are 


\+l  =  A  ai  Xk+l-i  +  gi\+l-i 


xk+l  =  (ai"8i)xk+l-i  +  if1  8i2k+l-i  (3) 

where  the  coefficients,  g^,  can  be  defined  by  the  characteristic 
polynomial  A  (A)  of  (A-KC"): 

N  ^  N-i 

A(X)  -  XN  +  I  (g  -a  )AN 

i*l  1  1 

See  Figure  5  for  a  block  diagram  of  this  predictor.  Note  that  the 

N 

noise-free  MA  predictor  filter,  P(z)  ~  £  a  z  ,  is  preserved 

n=l  n 

in  the  feedback  loop,  but  that  the  residual  sequence  v^'Z^x^  is  now 

weighted  with  a  feedforward  MA  filter,  Q(z)  =  ?-  gRZ  n. 

n=l 

Why  is  the  noisy  Kalman  predictor  ARMA  and  not  MA?  The  answer  is 
that  (z^),  a  noisy  version  of  an  AR  signal  process,  obeys  an  ARMA(N,N) 
difference  equation.  As  an  AR(N)  model  has  an  MA(N-l)  predictor,  it  is 
at  least  logical  (if  not  intuitive)  that  an  ARMA(N,N)  process  has  an 
ARMA(N,N-1)  predictor. 

2 

The  Noise  Free  Predictor  (a  =  0) 

n 


The  prediction  vector  consists  of  the  terms 
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E^xk-lW■l/^k-l,Zk-2,•*• 


1 


E[xk-l/zk-l,zk-2,,*‘I 

Elxk/zk-l,zk-2’”*  J 
2 

When  0^*0,  then  V  k  and 

ElVn/Vn*zk-n-l*”*J 

"  E[xk-n/\-n*”-]  “  *k-n  *  "-1.2.... 

So  In  this  case  the  prediction  vector  is 

*k-N+li 
Xk-N+2 

*k-l 

* 

*k/k-l 

It  follows  that  P,  the  covariance  EEX^-X^] [X^-X^] '  is 

|0  ...  0 
•  • 

•  • 

0 

0  ..  0  o" 


(4) 


Calculating  K  by  substituting  (4)  into  (1),  we  find  that  A(X)  “  X 


N 


and  hence  g^-a^.  This  implies,  as  one  would  expect,  that  the  prediction 
filter  (3)  reduces  to  the  purely  moving  average  relation 


N 

xk+l  =  1E1  ai  zk+l-i  * 

Minimum  Variance  Control 

One  of  the  simplest  control  strategies  is  minimum  variance  regula¬ 
tion  where  one  desires  to  minimize  the  variance  of  the  AR(N)  output 
sequence  {x^},  and  force  E(xk)-0.  The  well  known  separation  principle 


1 


*•  » 


allows  one  to  generate  a  feedback  control  strategy  assuming  noisefree 
measurements,  i.e.  n^»0,  and  then  use  the  same  strategy  in  the  noisy 
case  but  with  the  Kalman  filter  estimates  {x^}  replacing  the  actual 
filter  outputs  {x^. 

Assume  then  we  have  the  system 


N 

-  E  a. 


*k  “  ^  aiVi 


+  +  V,. 


where  {v^}  is  our  feedback  control  sequence.  We  would  like  to  minimize 

E(xk)  "  E(1f1  ai  Vi  +  \  +  vk)2 

-  E <w2  +  2wkvk  +  2W  (  E  a1xk_i  +  (vk  +  E  alVi)2) 

i-1  i“l 

-  E(w2)  +  2E(wkvk|vk)E(vk) 

N  N  , 

+  2E(wk(  r.  a1xk_1) )  +  EC(vk  +  a^^)  ). 

Since  {wk>  is  uncorrelated  with  i>l,  and  since  E (wkvk | vk)  *  0, 

2 

it  is  clear  that  E(xk)  is  minimized  by  choosing 
vk  =  -EVk-i 

This  control  is  illustrated  in  Figure  5  as  a  feedback  loop  running 
up  the  left  side  of  the  figure.  The  feedback  loop  to  the  top  COMPUTE 
box  shows  how  xk  would  be  used  for  minimum  variance  control  in  the  noisy 


Detection  and  the  Likelihood  Ratio 

Consider  the  hypothesis  test  Hq  vs.  with 

Ho  :  zk  “  "k  ’  fc-0’1*---* 

H1  :  zk  “  *k  +  ”k  ’ 

This  test  is  equivalent  to  the  test  vs.  where 
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N 


N 


lari  |  -  r  ,  n«l,2, . . . ,N 

.  m  n-m  n  »  »  » 

m-1  1  1 


These  equations  characterize  the  that  minimize  the  quadratic  form 

N  N  N 

Q„  *  r.  -  2  I  ar  +  E  E  aa  r  i  i 

N  0  -mm  ,  ,  m  n  n-m 

m»l  m**l  n=l  1  1 

The  minimum  is 


_N,  .  ^  N 

QnCV—V  "  r0  -  E.am  rm 

m»l 


(4b) 


Assuming  the  normal  equations  to  be  non-singular,  we  may  write 


N  N  N 

“m  =  Smn  rn  ’  m“1’2’"*’N 

n=l 


(5) 


N  ^  ^  N 

QN(r1,...,rN)  -  r  -  E  E  rm  Smn  r 

m»l  n°l 


(6) 


Write  out  QN(0  as  follows: 


N-l 


Tl  1* 


0  ■ 

•  ^  U  l  4-1 

..  mm 

m=l 

N-l 

!  E 

Vn  r|N-m| 

m“l 

2 

lNr0 

-  2unrN  +  Q, 

N-l  N-l 

tv,r„  +  E  E  a  a  r  |  i 
N  N  ,  ,  m  n  m-n 

m=l  n=l 


L  2 
+ 


So  the  minimization  of  Q„(r., . . ,,r„)  with  respect  to  {r*  },  leads  to 

N  1  N  ml 


Q^(rlt...,rN)  -  min  Q^r^  . . .  ,rM> 
(a 

m  1 


min[V0  -2aNrN  +  min„  ,  ^N-l  ^rl_aNrN-l*  *  *  *  ^ 

°N 


i  iN-1 
m  l 


min[aNro  ~2aNrN  + 
aN 


(7) 


This  equation  contains  the  essence  of  dynamic  programming  and  the 
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4  ■ 


N-l  N-l  jj_1 

principle  of  optimality;  Once  the  {a^, . . .  are  found, 

constructed  and  found  as  a  function  of  rQ»r{j  and  . ,a^_* J 

continues  in  this  way.  At  each  step  of  the  way  the  minimization 

on  QN  _^(0  is  quadratic. 

Let's  use  (6)  in  the  RHA  of  (7): 

Q*(rl,...,rN)  =  min[^r()  -2^  +  rQ 

“N 


N-l  N-l 


,N-1 . 


E.  ^rm-aNrN-m^  ^mn  ^V^V-n^  1 

m=*l  n=l 


“n 


N-l 

N-l 

N- 

I  -  * 

Z 

r .  S 
N-m  mn 

m=l 

n«l 

N-l 

N-l 

„N-1 

Z  r 

I 

S  rM 

i  m 
m=l 

n«l 

mn  N 

N  N 

+  QN-l(ri’  •  "’rN-l^ 

It  follows  easily  that  the  minimizing  value  of  is 

N-l  N-l  N-1 

rN  E  2  rmSmn  rM-n 

N  _  _ m=l  n=l _ 

“n  N-l  N-l  „  . 

r_  -  I  Z  r„  S  1  r„ 

0  ,  .  N-m  ran  N-n 

m=l  n=l 


Use  (5)  in  the  numerator  to  get 

N:1  N-i 

M  rN  l.  an  rN-n 
N  n=l 


“N 


N-!  N-l  N-i 

r0  “  E1  rN-mSran  rN-n 
m=l  n=l 


Let's  call 


°n  *  1  rN-m  SL1  ’  n=1»2»-- -.N"1 

m~i 


may  be 
.  One 
problem 


and  see  if  we  can  find  a  recursion  for  it.  In  the  meantime 
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l.  Cm  r|n-mj  "  rN+l-n  '  n‘*1’2, * • * »N_1 


(9) 


It  therefore  satisfies  a  recursion  just  like  a  .  In  fact  (9)  is  just 

m 

(Aa)  with  the  RHS  vector  turned  upside  down.  By  the  symmetry  of 

R  *  {ri  |  },  this  simply  turns  the  solution  vector  {ctN}  upside  down. 

I  m-n  I  in 

Thus 

JJ  N  .  _ 

Cm  “  aN+l-m  ’  m“1»2****»N 

and  we  are  done. 

Summary;  The  Levinson  recursions  may  now  be  written 
V  N-l 

„  rN  1  “n  rN-n 
N  _  n»l _ 

“N'  "r1  n-i 

r.  -  I  a„  r„ 

0  ,  N-n  N-n 

n=l 


N  N-l  N  N-l  .  -  . 

am  ~  “m  “  aNaN-m  *  1,2,..., N-l 

N  N-l  N.  N“1  N-l  . 

Q„  =  Q„  ,  -a„(r„  -  E  a  r_,  ) 

N  N-l  N  N  ,  m  N-m 

ra*l 

N 

Call  a  *  a  ,  n=l,2,...,N  to  have  the  solution  to  the  normal  equations 
n  n 

given  in  (A) .  To  get  these  equations  into  their  fully  modem  form,  one 
must  look  at  a  backwards  prediction. 


V.  Frequency  Tracking  and  Dynamic  Progranming 
Phase  and  frequency  Cracking  problems  comprise  some  of  the  most 
nettlesome  nonlinear  filtering  problems  in  the  entire  realm  of  signal 
processing  and  conmunication.  A  typical  problem  is  the  following: 
observe  the  sequence  {z^}  with 

2k  "  sk  +  nk 

Kk 

sk=  e 

j“kk 

and  estimate  the  FM  sequence  {ujj.}.  Here  s^  =  e  is  a  randomly  fre¬ 
quency  modulated  signal.  The  sequence  {n^}  is  additive  noise  and  {u>k) 

2 

is  a  sequence  of  angular  frequencies.  Assume  n.  :N(0,o  )  (complex  normal) . 

We  wish  to  observe  a  record  of  data  {z^}^1  and  infer  the  most  likely 

A  KN 

sequence  of  frequencies,  {“j^q  •  For  our  m°del  of  we  take  each 

toke{0,2u/Q, . . . ,  2ir(Q-l) /Q),  evolving  according  to 

“kN  =  “(k-l)N  +  VkN 
^kN+Jl  =  wkN  ’  1=0.1»---»N-1  V  k 

A  typical  sequence  {u^}  is  illustrated  in  Figure  5.  The  independent 
increments  sequence  {v^}  is  a  sequence  of  i.i.d.  r.v.s.  selected  in  such 
a  way  that  the  transition  probabilities 

p(“kH/“(k-l)M) 

correspond  to  our  notion  of  physical  reality.  We  may  think  of 


the  sequence  as  a  finite-state  random  walk  on  the  circle  with  an 

unusual  transition  probability  structure.  Typical  trajectories  for  { oj.  } 
j0Jkk 

and  {s.  =  e  }  are  illustrated  in  Figure  5. 


Let's  organize  the  sequence  {z^Iq  into  contiguous  blocks  of  length 
N,  of  the  form 


See  Figure  6.  Recall 
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ZkN+4  "  e  nkN+4 

+  <w 

K— 1 

V>J  A  M_1  IfM 

Write  {z.}q  “  u  {zkN+i*4-0*  Consider  the  joint  density  of  {z^q 

k-0 

311(1  {wkN}0~1: 

f(  U  *zkN+i}0  ’  {u)kN}0  * 

k-0 


Exploit  the  probabilistic  structure  of  {z^}  and  to  write  this  as 

kN/“(k-l)N) 


K-l  N-l  joi  (kN+4)  ,  K-l 

f(*,*)  *  IT  it  N  (e  ,a")  IT 


k=0  4=0  kN+4  “  k-0 

The  natural  logarithm  of  f(*,*)  is  proportional  to 


K-l  N-l 

£nf  (•,  •)”  2Re  —j  Z  Z  e 


-jWkN(kN+i) 


2 a  k-0  4-0 
n 


K-l 

+  ^  In  P(wkj}/h>(k_1)N) 


Write 


K-l  -ju>.NkN  N-l 

inf(-,-)  '  2Re  ~  *  e  S  zWN+lle  + 

2o  k-0  4=0 

n 


N-l 

+  ^  4np(u,kN/W(k_1)li) 

Let  XkN(0)  denote  the  finite  Fourier  transform 

J94 


N-l 

XkN(0)  ”  Z  ZkN+4 
1=0 


Then  the  log- likelihood  function  may  be  written 

XkN(0‘WkN) 


*%  I*  e 


2o  k-0 
n 

N-l 

+  ^  ^P^kN^Ck-l)^ 


4nf (  • ,  • )  -  Re 
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~  XN 

Our  notion  of  the  most  likely  sequence  {ci^q  Is  the  sequence 


kN+i 


kN 


Jl-0,1,2 . N-l 


k-0,1 . K-l 


,K-1 


where  is  the  sequence  that  maximizes  Thus  we  consider 


the  maximization  problem 
1 


max 

,  iK-1 
{aikN  0 


K-l  -j^kN 
Re  £  e 


k=0 


K-l 

XkN(0=a)kN)  +  kfQ  <lnp(“kN/a'(k-l)N) 


Write  this  as 


max 

r  i*-1 
{a,kN}0 


K-l 


with  T  satisfying  the  following  recursion 
+  Hnp(iu  /a) 


rt  =  rt-l 


1 

tN'“'(t-l)N)  +  ^2  **  6  XtN(0=a’kN) 

n 


So  our  maximization  problem  becomes 

max  „  .  [  max  v  ,  rK-2+£nf(w(K-l)N/w(K-2)N)  + 

‘"WiS  ‘"kN'o 


X(K-1)N(0  “  (K-l)N^ 


This  form  leads  to  the  following  observation:  the  maximizing  frequency 
trajectory,  call  it  {<!„},  passing  through  on  its  way  to  “(k-DN’ 


must  arrive  at  u(K_2)n  al°n8  a  route 


K-3 


that  maximizes  r„ 


If  it  did  not  we  could  retain  w(K_2)n  W(K-1)N 


K-2* 

and  with  a  different 


sequence  to  get  a  larger  rK_^.  It  is  this  observation  that  forms  the 

basis  of  forward  dynamic  programming. 

Recall  the  w.  „  e{2iTr/Q}^  This  means  the  finite  Fourier  trans- 
kN  r*»0 

form  5(^(9)  must  only  be  evaluated  on  0=2Tir/Q,  r*»0,l, . . .  ,Q-1 .  The  best 

N-l 

way  to  do  this  is  to  zero-pad  to  obtain  a  Q-point  sequence 


that  can  be  FFT'd  to  get  Xj^(0"2irr/Q) .  See  Figure  5.  Then  for  each  node 
on  Figure  5  we  evaluate  XjtN(0=2irr/Q) ,  and  find  the  best  route  through  the 
trellis  with  a  dynamic  programing  algorithm.  This  completes  our  algorithm 
for  moderating  the  usual  peak-picking  rule  on  the  FFT  with  prior  informa¬ 
tion  P(WkN/U(k_1)N)  . 

The  reader  is  referred  to  [9]  for  a  more  complete  discussion  of  a 
related  algorithm  for  nonlinear  phase  tracking. 
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VI.  Local  Boundary  Estimation  in 
Noisy  Black  and  White  Images 

In  digital  Image  processing  one  is  interested  in  developing  computer 
algorithms  which  can  either  automatically  extract  information  from  pictures 
or  at  least  simplify  the  process  of  manually  interpreting  them.  In  either 
case,  a  basic  step  involves  segmenting  a  picture  into  regions  with  similar 
features  such  as  grey  level  or  texture.  This  involves  the  estimation  of 
region  boundaries.  Boundary  estimation  algorithms  make  use  of  operators 
which  estimate  short  segments  of  boundaries  using  picture  data  in  small 
picture  sections.  Examples  are  simple  gradient  operators  or  the  well 
known  Hueckel  operator  Q.0] .  An  example  of  a  local  sequential  estimator 
which  is  also  used  for  this  purpose  can  be  found  in  [11] .  In  this  section 
we  outline  a  new  dynamic  programming  algorithm  for  sequentially  estimating 
short  boundary  segments. 

Image  and  Boundary  Models 

Let  a  digitized  black  and  white  image  be  represented  by  a  matrix 
with  components  corresponding  to  the  grey  level  value  of  a  picture 
element  (pixel)  centered  at  position  (i,j) .  The  value  g^  will  have  two 
components  -  a  true  picture  component  and  a  noise  component  n^  so 


that  g 


ij 


b, .  +  n,,.  A  picture  is  assumed  to  consist  of  a  single  region 
ij  ij 


of  grey  level  r^n  lying  in  a  background  of  grey  level  rout>  so  that  by 

can  take  on  either  of  the  two  values  ra  or  r  The  noise  components 

in  out 

n^  are  assumed  to  be  independent  identically  distributed  Gaussian  random 

2 

variables  with  mean  zero  and  variance  o  . 

An  edge  element  is  defined  as  the  line  segment  separating  two 

adjacent  pixels,  and  as  shown  in  Figure  8  a  boundary  segment  consists 

M 

of  a  directed  sequence  of  edge  elements  {t^}^. 


As  illustrated  in 
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Figure  9  ,  we  model  short  boundary  edge  sequences  as  being  generated 
sequentially  by  passing  out  of  rectangular  boxes  R^,  k=*l,2, . . . ,N,  of 
size  =  kx2(K-l) .  Figure  10a  gives  an  example  of  a  boundary  which 
is  consistent  with  this  model  while  Figure  10b  shows  a  similar  but 
inconsistent  boundary.  In  the  latter  case  the  edge  sequence  re-enters 
B^.  Although  this  model  restricts  somewhat  the  types  of  boundary 
segments  that  can  be  generated,  it  is  still  very  reasonable  for  region 
boundaries  with  low  and  slowly  varying  curvatures  such  as  the  Cat  Scan 
Lung  Section  shown  in  Figure  11 .  The  maximum  rectangle  size  N  is 
assumed  fixed  a  priori  and  is  a  function  of  the  boundary  curvature 
properties  for  the  region  of  interest. 

Boundary  segments  generated  by  such  a  model  are  naturally  represented 
by  a  sequence  of  states  in  a  Markov  chain  where  the  index  parameter  k 
for  the  rectangle  size  is  also  the  index  parameter  for  the  Markov 
process.  A  process  state,  x^,  at  "time"  k,  k  ^  1,  will  correspond 
geometrically  to  the  end  point  of  a  boundary  sequence  passing  out  of 
R^.  Figure  12  shows  the  state  locations,  x^,  for  k=l,2,...,5.  Note 
that  the  number  of  possible  states  at  time  k  is  1  for  k=l,  3  for  k*2, 
and  9  +  4(k-3)  for  k  ^  3. 

Figure  13  contains  an  abstract  representation  of  a  typical  realiza¬ 
tion  of  the  Markov  process,  together  with  a  description  of  the  picture 
or  character,  c^,  associated  with  each  state.  The  observed  image  will 
be  a  noise  corrupted  version  of  each  such  picture. 

If  the  regions  of  interest  have  smooth,  low  curvature  boundaries 
then  a  reasonable  rule  for  assigning  transition  probabilities  p | 
is  to  choose  p(XjJx^  to  be  inversely  related  to  the  distance  (measured 
in  edge  elements)  between  states  x^_^  and  x^.  We  must  also  impose  the 
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9+4(k-3) 

constraint  that  Z  p(jt^|x^  )  *  1. 

J = ^ 

Using  this  model,  a  boundary  edge  sequence  { }"  will  correspond 
N 

to  a  state  sequence 

A  Dynamic  Programming  Algorithm  for  Fitimating  Boundary  Segments 


Using  the  pixel  data  in  an  Nx2(N-l)  block,  R^,  we  next  formulate 
a  dynamic  programming  algorithm  for  optimally  estimating  edge  sequences 
that  are  generated  by  the  previously  described  model.  The  algorithm  is 
optimal  in  the  sense  that  it  maximizes  the  joint  likelihood  of  an  edge 
sequence  through,  and  all  the  data  in,  the  rectangle  R^. 

To  begin  we  first  define  the  pixel  data  sets 

Dk  =  {gij  :  pixel 

dk  =  {gij  :  Pixel  pixel  (i, j)  ^R^}. 

This  implies  that  =  D^^Ud^.D^3  empty  set.  This  recursion  is 

essential.  Next  let  S.nL(-)  denote  a  log- 

N 

likelihood  function,  and  =  (x^)-^  denote  a  boundary  state  sequence  of 
length  N.  Then  HnL(D^,S^X  the  joint  log-likelihood  of  a  boundary  state 
sequence  and  the  picture  data, must  satisfy 

anL(DN’V  a  *nL(DNiv +  *nL<y  w 

where  £nL(SN)  is  the  log -likelihood  of  the  state  sequence  and 

£n(DN|SN)  is  the  pixel  data  log-likelihood  conditioned  on  the  boundary 
M 

{t^}^  described  by  S^.  Since  the  state  sequence  S^  is  a  Markov  chain  we 
can  use 

HnL(SN)  =  !lnL(SN_1)  +  fa^xjx^j)  (10) 

inL(S^)  =  ^nL(x^)  =  £npg(x^)  (11) 

where  Pg(x^)  is  the  probability  of  a  particular  starting  state  x^. 

Since  boundary  edge  sequences  are  prohibited  from  re-entering  rectangles 


from  which  they  have  already  passed  out  of,  we  can  express 


(12) 


inL(DN|SN)  -  taUD^IS^)  +  InL^I^) 

where  InL(dgjx^)  Is  the  log-likellhood  of  the  data  added  In  extending 

the  state  sequence  to  conditioned  on  the  specific  new 

state  x^.  Substitution  of  (12)  and  (10)  into  (9)  leads  to  the 

following  recursive  expression  for  L(DU»SU): 

inL(DN,SN)  »  lnL(DN1,SN1)  +  tnp(xN|xN1)  +  AnL(dN  1^)  -  (13) 

The  transition  probabilities,  p(Xjt|x^_^) ,  can  be  calculated  using  a 

distance  rule  such  as  the  one  discussed  above,  while  incremental  data 

log-likelihoods,  £nL(d^|x^)  can  be  calculated  by  observing  that  the 

2 

pixel  grey  level  values  g^  are  N(rin»a  )  lies  inside  the  region 

2 

and  N(r  , o  )  when  g.,  lies  outside  the  region.  Furthermore  once  x 
out  °ij  k 

has  been  specified,  all  pixel  values  in  d^  can  be  associated  with 
pixels  either  inside  of  or  outside  of  the  region.  Hence  if  we  define 


PG(x) 


— —  exp(-x2/2o2) 


^■no2 


we  can  use 


S.nL(dk|xk)  -  E  ^?G{g  -rin)  +  E  ^G(«1J-tout) 

8Uedk  gijEdk 

(i,j)  inside  (i,j)  outside 

region  region 


*  C  -  Z 


<kij-rin,2/2'’2  -  1  .  <8irro»t,2/2°2 

8<<e,1k 


(i,j)  inside 
region 


(i,j)  outside 
region 


(14) 


where  C  is  a  constant  which  is  independent  of  the  choice  of  x^. 

Finally,  a  dynamic  programing  algorithm  for  estimating  a  state 

H 

sequence  and  hence  a  boundary  edge  sequence  which  maximizes 

inL(DN,SN)  can  be  derived  by  observing  that 
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max  HnL(DN,SN)  -  max  [max  InL  (D^ ,  )  +  tnptxjjx^) 

Xn  SN-1 

+  *nL(dNIV>  (15) 

This  completes  the  specification  of  the  algorithm. 
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Figure  Captions 


Figure  1.  States  and  Characters 

Figure  2.  Sunflowers,  States,  and  Characters  for  Frequency  Tracking 

Figure  3.  Goertzel  Filter  for  DFT  Component  X 

01 

Figure  4.  Four  Point  Decimation- in-Frequency  FFT 

Figure  5.  Prediction,  Detection,  and  Control  of  a  Noisy  AR(N)  Sequence 

Figure  6.  Visualizing  the  Random  Walk  Frequency  Trajectories 

Figure  7.  Date  Processing  and  Frequency  Trellis  Illustrating  Evolution 
of  Surviving  Frequency  Tracks 

Figure  8.  A  Boundary  Segment  in  Small  Picture  Segment 

Figure  9.  Example  of  Boundary  Segment  Generation 

Figure  10a.  Example  of  a  Boundary  Consistent  with  Model 

Figure  10b.  Example  of  a  Boundary  Inconsistent  with  Model 

Figure  11.  CAT  Scan  of  Lung  Section 

Figure  12.  Possible  State  Locations  x^  for  k*l,2,...,5 

Figure  13.  State  Transition  Diagram  for  k=l,2,...,5  Illustrating  a  Set 

of  Characters  C^,  k=l,2,...,5  for  a  Specific  Process  Realiza¬ 
tion  x^,  k=l,2,...5 


Close  for  Control 


Figure  9.  Example  of  Boundary  Segment  Generation 


Figure  10(a) .  Example  of  a 

Boundary  Consistent  with  Model 


Figure  10(b) .  Example  of  a  Boundary 
Inconsistent  with  Model 


